All Questions
Tagged with random text-processing
14 questions
1 vote
4 answers
665 views
How to randomly extract a substring of 200 characters from a fasta file
Is there any Linux command one can use to extract a sequence from a file? For instance, a file contains one million lines, and we want to randomly sample only a sequence of 200 characters from that ...
0 votes
2 answers
425 views
generate a password with minimum 2 numbers, 2 CAPS, 1 symbol, x lower alpha and max length of 16
I tried pwgen and makepasswd to generate a password... but these are not enough to generate with exact count of literals and numbers... pwgen -c -n -y -B 12 1 This is not working as expected as I ...
1 vote
3 answers
478 views
Replace the last alphanumeric with a random char of same type [closed]
I have a simple list of alphanumeric machine names in a file, and I want to change the last letter or number, but I want it to match character type -- so if the last character is a digit, it should be ...
-3 votes
1 answer
4k views
random test log generator [closed]
I need to write a script to create random test log data. Log data should be delimited with "|" (pipe) and have 10,000 lines with 500 bytes, and each line should contain the following format with random data. ...
1 vote
3 answers
895 views
How to randomly subset a file and then select the same line numbers from multiple files
I have a file that contains 3494 lines, of which I would like to randomly select 100, and write those lines to a new file. I can do that using this: shuf -n 100 input_file.txt > output_file.txt However,...
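One way to approach this kind of task, offered as a rough sketch and not as the accepted answer: draw the line numbers once, then reuse that list for every file. The file names (`file_a.txt`, `file_b.txt`, `picks.txt`) and the demo data are hypothetical, and GNU coreutils `shuf` is assumed to be available.

```shell
# Hypothetical demo data: two files with 3494 aligned lines each.
sample_dir=$(mktemp -d)
seq 1 3494 | sed 's/^/a/' > "$sample_dir/file_a.txt"
seq 1 3494 | sed 's/^/b/' > "$sample_dir/file_b.txt"

# Draw 100 distinct line numbers once, so every file uses the same selection.
shuf -i 1-3494 -n 100 | sort -n > "$sample_dir/picks.txt"

# For each file, print only the lines whose number appears in picks.txt.
for f in file_a file_b; do
  awk 'NR == FNR { keep[$1]; next } FNR in keep' \
    "$sample_dir/picks.txt" "$sample_dir/$f.txt" > "$sample_dir/$f.sample.txt"
done
```

Because the selection lives in its own file, the same 100 line numbers can be applied to any number of additional files later.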
3 votes
3 answers
1k views
Generate four random words from a list for XKCD-like passwords
To start, we know how to output random lines from a txt file: perl -MList::Util -e 'print List::Util::shuffle <>' words.txt But we need a general solution (perl is usually installed on ...
9 votes
2 answers
8k views
shuf file --output file: in-place shuffling
The shuf command has an --output flag that you can use to specify where to write its output (instead of writing to stdout). I want to shuffle a file in-place. Is it safe to use shuf file --output ...
2 votes
1 answer
938 views
Split text file in random halves based on category
I have a text file that looks like this: n03250847/n03250847_0.JPEG n03250847 n03250847/n03250847_1.JPEG n03250847 ... n03250847/n03250847_499.JPEG n03250847 ... n03255030/n03255030_0.JPEG n03255030 ...
2 votes
2 answers
327 views
Extract random sample of N lines based on pattern
I have a file formatted like this: train/t/temple/east_asia/00000025.jpg 94 train/t/temple/east_asia/00000865.jpg 94 ... train/s/swamp/00000560.jpg 92 train/s/swamp/00000935.jpg 92 .... train/m/...
9 votes
1 answer
3k views
Shuffle two parallel text files
I have two sentence-aligned parallel corpora (text files) with about 50 mil words. (from the Europarl corpus -> parallel translation of legal documents). I'd now like to shuffle the lines of the two ...
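A common pattern for this kind of problem, sketched here under stated assumptions and not taken from the question's answers: join the two files line by line, shuffle the joined lines, then split them apart again. The sketch assumes GNU `shuf`, assumes no line contains a tab character, and uses hypothetical file names and toy data.

```shell
# Hypothetical toy corpora with three aligned sentence pairs.
corpus_dir=$(mktemp -d)
printf '%s\n' 'hello' 'good morning' 'thank you' > "$corpus_dir/en.txt"
printf '%s\n' 'hallo' 'guten Morgen' 'danke' > "$corpus_dir/de.txt"

# Join line N of both files with a tab, then shuffle the joined lines as units.
paste "$corpus_dir/en.txt" "$corpus_dir/de.txt" | shuf > "$corpus_dir/pairs.tsv"

# Split the shuffled pairs back into two files that are still aligned.
cut -f1 "$corpus_dir/pairs.tsv" > "$corpus_dir/en.shuf.txt"
cut -f2 "$corpus_dir/pairs.tsv" > "$corpus_dir/de.shuf.txt"
```

If the text may contain tabs, a different delimiter that never occurs in the data would have to be chosen for both `paste -d` and `cut -d`.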
1 vote
3 answers
173 views
Random sampling and outputting the largest value
I have a fairly large data set (~500 million rows). The data set looks like the one below. Col 1 is a float number, col 2 is a MAC ID (device ID) 1616.93,ac:22:0b:a6:22:c3 2872.32,c0:bd:d1:36:bb:49 3314.55,d4:0b:1a:...
5 votes
4 answers
3k views
How to fill a file with a stream from /dev/urandom with a specified number of lines?
I am trying to fill a file with a sequence of random 0s and 1s with a user-defined number of lines and number of characters per line. The first step is to get a random stream of 0s and 1s: cat /dev/...
6 votes
2 answers
7k views
random permutation of lines of text
If I have a file with following content: 0001 0002 0003 0004 0132 0137 0138 0141 How can I get a random permutation of them in bash?
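As a minimal sketch of one possible approach, assuming GNU coreutils `shuf` is installed: `shuf` prints its input lines in random order. The file names below are hypothetical, and the sample content mirrors the lines in the question.

```shell
# Recreate the sample input in a scratch directory (hypothetical file names).
perm_dir=$(mktemp -d)
printf '%s\n' 0001 0002 0003 0004 0132 0137 0138 0141 > "$perm_dir/numbers.txt"

# Write a random permutation of the lines to a new file and show it.
shuf "$perm_dir/numbers.txt" > "$perm_dir/shuffled.txt"
cat "$perm_dir/shuffled.txt"
```

Each run is expected to produce a different order, while the set of lines stays the same.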
12 votes
4 answers
2k views
Shuffle file randomly with some additional constraints
I have a huge music playlist and, while some artists have many albums, others have just one song. I wanted to sort the playlist so the same artist won't play twice in a row, or his songs won't end up ...